35 research outputs found
La polysémie régulière dans WordNet
This paper presents an analysis and modeling of polysemy in the WordNet English lexical database. It exploits the concept hierarchy (made up of synsets) and the gloss defining each of these concepts. The result is a set of rules that enabled us to identify, in a largely automated way and with a precision close to 91%, more than 2100 synset pairs connected by a regular polysemy relation. Our method also allows a partial word sense disambiguation of the definitions associated with these synsets.
From the Definitions of the "Trésor de la Langue Française" To a Semantic Database of the French Language
The Definiens project aims at building a database of French lexical semantics that is formal and structured enough to allow fine-grained semantic access to the French lexicon, for such tasks as automatic extraction and computation. To achieve this in a relatively short time, we process the definitions of the Trésor de la Langue Française informatisé (TLFi), enriching them with XML tagging that makes their internal organization explicit (roughly, genus and differentiae) and enhancing the components with semantic labels that make explicit their role in the definition. There is, to our knowledge, no existing broad-coverage database for the French lexicon that offers researchers and NLP developers a structured decomposition of the meaning of lexical units. Definiens is ongoing research that will hopefully fill this gap in the near future.
Improvement of VerbNet-like resources by frame typing
Verbenet is a French lexicon developed by "translation" of its English counterpart, VerbNet (Kipper-Schuler, 2005), and by treatment of the specificities of French syntax (Pradet et al., 2014; Danlos et al., 2016). One difficulty encountered in its development springs from the fact that the list of (potentially numerous) frames has no internal organization. This paper proposes a type system for frames that shows whether two frames are variants of a given alternation. Frame typing facilitates coherence checking of the resource in a "virtuous circle". We present the principles underlying a program we developed and used to automatically type frames in Verbenet. We also show that our system is portable to other languages.
Building a lexicon of French deverbal nouns from a semantically annotated corpus
The ongoing Nomage project aims at describing the aspectual properties of deverbal nouns in an empirical way. It is centered on the development of two resources: a semantically annotated corpus of deverbal nouns and an electronic lexicon. We present both in this paper and emphasize how the semantic annotations of the corpus allow the lexicographic description of deverbal nouns, in particular their polysemy, to be validated.
Dictionary-Ontology Cross-Enrichment Using TLFi and WOLF to enrich one another
It has been known since Ide and Veronis that it is impossible to automatically extract an ontology structure from a dictionary, because that information is simply not present. We attempt to extract structure elements from a dictionary using clues taken from a formal ontology, and use these elements to match dictionary definitions to ontology synsets. This allows us to enrich the ontology with dictionary definitions, assign ontological structure to the dictionary, and disambiguate elements of definitions and synsets.
Un Verbenet du français
VerbNet is a lexical resource for English verbs that has proven useful for NLP thanks to its high lexical and syntactic coverage and its systematic coding of thematic roles. No such resource exists for French, which motivated us to develop a Verbenet for French. We present how we developed Verbenet from VerbNet while using the available lexical resources for French as far as possible, and how the various French alternations are coded, focusing on differences with English (the existence of pronominal forms, for example). This paper should allow an NLP researcher to use Verbenet in a simple and efficient way for a task such as semantic role labeling.
Regular Polysemy in WordNet
The importance of describing regular polysemy in a lexicon has often been stressed, especially in the field of natural language processing (for a good overview of this issue, see Ravin and Leacock, 2000). Unfortunately, no existing broad-coverage semantic lexicon has been built following this relatively recent advice. Since producing a broad-coverage semantic lexicon is a very time-consuming task, the idea has to be put into practice on existing lexicons. WordNet is an appropriate lexical semantic resource for running this experiment, as it is machine readable and has wide coverage (Fellbaum, 1998). In this paper, we introduce a method to create regular polysemy patterns from WordNet data and to automatically detect their occurrences in the lexicon.
De la simplicité en morphologie
Simple lexical units are rarely studied in derivational morphology, although they are the starting point of the mechanisms that underlie it. In this article, we build and analyze a large corpus of French simple nouns. The aim is to test the hypothesis of Croft (1991) that simple nouns prototypically denote objects, unlike constructed nouns, which mainly refer to actions or to properties depending on whether they are derived from verbs or from adjectives. We first compile a corpus of simple nouns, which presupposes defining this notion, a more problematic one than it first appears. We then propose a semantic annotation of the roughly 3500 simple nouns retained, as object, action, or property nouns, and we detail the series of linguistic tests used to establish this classification. Our study shows that about a quarter of simple nouns do not denote (or do not only denote) objects. Some of them belong to more specific nominal classes that cannot be reduced to the three categories initially considered.
Developing a French FrameNet: Methodology and First results
The Asfalda project aims to develop a French corpus with frame-based semantic annotations and automatic tools for shallow semantic analysis. We present the first part of the project: focusing on a set of notional domains, we delimited a subset of English frames, adapted them to French data when necessary, and developed the corresponding French lexicon. We believe that working domain by domain helped us enforce the coherence of the resulting resource. It also has the advantage that, although the number of frames is limited (around a hundred), we obtain full coverage within a given domain.